268 research outputs found

    L0 Sparse Inverse Covariance Estimation

    Recently, there has been focus on penalized log-likelihood covariance estimation for sparse inverse covariance (precision) matrices. The penalty is responsible for inducing sparsity, and a very common choice is the convex $l_1$ norm. However, the best estimator performance is not always achieved with this penalty. The most natural sparsity-promoting "norm" is the non-convex $l_0$ penalty, but its lack of convexity has deterred its use in sparse maximum likelihood estimation. In this paper we consider non-convex $l_0$ penalized log-likelihood inverse covariance estimation and present a novel cyclic descent algorithm for its optimization. Convergence to a local minimizer is proved, which is highly non-trivial, and we demonstrate via simulations the reduced bias and superior quality of the $l_0$ penalty as compared to the $l_1$ penalty.
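
To make the optimization concrete, here is a minimal sketch of $l_0$-penalized inverse covariance estimation by cyclic (coordinate-wise) descent. It is not the paper's algorithm: only the off-diagonal entries are cycled, the exact coordinate update is replaced by a crude 1-D grid search, and the names `objective` and `l0_cyclic_descent` are ours.

```python
import numpy as np

def objective(Theta, S, lam):
    """Penalized negative log-likelihood:
    -log det(Theta) + tr(S Theta) + lam * (number of nonzero off-diagonals)."""
    sign, logdet = np.linalg.slogdet(Theta)
    if sign <= 0:
        return np.inf  # outside the positive-definite cone
    off = Theta[~np.eye(Theta.shape[0], dtype=bool)]
    return -logdet + np.trace(S @ Theta) + lam * np.count_nonzero(off)

def l0_cyclic_descent(S, lam, n_sweeps=20, grid=np.linspace(-2.0, 2.0, 81)):
    """Cycle over off-diagonal coordinates, setting each entry to the grid
    value (including 0, the hard-thresholded state) that minimizes the objective."""
    p = S.shape[0]
    Theta = np.diag(1.0 / np.diag(S))  # diagonal starting point
    for _ in range(n_sweeps):
        for i in range(p):
            for j in range(i + 1, p):
                best_val, best_obj = 0.0, np.inf
                for v in grid:
                    Theta[i, j] = Theta[j, i] = v
                    fv = objective(Theta, S, lam)
                    if fv < best_obj:
                        best_val, best_obj = v, fv
                Theta[i, j] = Theta[j, i] = best_val
    return Theta
```

Because the $l_0$ term is piecewise constant, each coordinate step amounts to comparing the penalized objective at the best nonzero candidate against the value at zero; the grid search above does this comparison implicitly since the grid contains 0.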

    Regularized Block Toeplitz Covariance Matrix Estimation via Kronecker Product Expansions

    In this work we consider the estimation of spatio-temporal covariance matrices in the low-sample, non-Gaussian regime. We impose covariance structure in the form of a sum of Kronecker products decomposition (Tsiligkaridis et al. 2013, Greenewald et al. 2013) with diagonal correction (Greenewald et al.), which we refer to as DC-KronPCA, in the estimation of multiframe covariance matrices. This paper extends the approaches of Tsiligkaridis et al. in two directions. First, we modify the diagonally corrected method of Greenewald et al. to include a block Toeplitz constraint imposing temporal stationarity structure. Second, we improve the conditioning of the estimate in the very low sample regime by using Ledoit-Wolf type shrinkage regularization similar to that of Chen, Hero et al. (2010). For improved robustness to heavy-tailed distributions, we modify the KronPCA to incorporate robust shrinkage estimation (Chen, Hero et al. 2011). Results of numerical simulations establish benefits in terms of estimation MSE when compared to previous methods. Finally, we apply our methods to a real-world network spatio-temporal anomaly detection problem and achieve superior results. Comment: To appear at IEEE SSP 2014, 4 pages.
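
As a rough illustration of the Kronecker-sum structure (without the block Toeplitz constraint, the diagonal correction, or the data-driven shrinkage weight of the paper), the sketch below computes a rank-r KronPCA-style approximation via the Van Loan-Pitsianis rearrangement and blends it with a scaled identity. The names `rearrange` and `kron_pca` and the fixed weight `shrink` are our simplifications.

```python
import numpy as np

def rearrange(Sigma, p, q):
    """Permute a (p*q x p*q) covariance into a (p*p x q*q) matrix so that a
    sum of Kronecker products A_k (x) B_k becomes a sum of rank-one terms."""
    return Sigma.reshape(p, q, p, q).transpose(0, 2, 1, 3).reshape(p * p, q * q)

def kron_pca(Sigma, p, q, r=2, shrink=0.1):
    """Rank-r Kronecker-sum approximation with shrinkage toward a scaled identity."""
    U, s, Vt = np.linalg.svd(rearrange(Sigma, p, q), full_matrices=False)
    Sigma_hat = np.zeros_like(Sigma)
    for k in range(r):  # invert the rearrangement term by term
        A = (s[k] * U[:, k]).reshape(p, p)
        B = Vt[k].reshape(q, q)
        Sigma_hat += np.kron(A, B)
    mu = np.trace(Sigma_hat) / (p * q)  # shrinkage target scale
    return (1 - shrink) * Sigma_hat + shrink * mu * np.eye(p * q)
```

The rearrangement turns the best rank-r approximation of R(Sigma), given by the truncated SVD, into the best sum of r Kronecker products in Frobenius norm, which is the core of KronPCA-type estimators.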

    Decomposable Principal Component Analysis

    We consider principal component analysis (PCA) in decomposable Gaussian graphical models. We exploit the prior information in these models in order to distribute the computation. For this purpose, we reformulate the problem in the sparse inverse covariance (concentration) domain and solve the global eigenvalue problem using a sequence of local eigenvalue problems in each of the cliques of the decomposable graph. We demonstrate the application of our methodology in the context of decentralized anomaly detection in the Abilene backbone network. Based on the topology of the network, we propose an approximate statistical graphical model and distribute the computation of PCA.
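
A hedged sketch of the concentration-domain reformulation: in a decomposable Gaussian graphical model the maximum-likelihood concentration matrix assembles from clique and separator marginals, and the leading principal component of the covariance is the eigenvector of the concentration matrix with the smallest eigenvalue. The paper distributes the eigenvalue computation across cliques; the centralized power iteration below only illustrates the reformulation, and the function names are ours.

```python
import numpy as np

def mle_concentration(S, cliques, separators):
    """Assemble K = sum_C [inv(S_CC)]^0 - sum_Sep [inv(S_SS)]^0, the MLE
    concentration matrix of a decomposable Gaussian graphical model."""
    K = np.zeros_like(S)
    for idx in cliques:
        ix = np.ix_(idx, idx)
        K[ix] += np.linalg.inv(S[ix])
    for idx in separators:
        ix = np.ix_(idx, idx)
        K[ix] -= np.linalg.inv(S[ix])
    return K

def leading_pc(K, n_iter=500):
    """The leading PC of the covariance is the eigenvector of K with the
    smallest eigenvalue; shift and power-iterate on (c*I - K) to find it."""
    c = np.linalg.norm(K, 2) + 1.0  # shift above the largest eigenvalue of K
    v = np.random.default_rng(0).standard_normal(K.shape[0])
    for _ in range(n_iter):
        v = c * v - K @ v
        v /= np.linalg.norm(v)
    return v
```

For a chain graph on five variables one might take cliques = [[0, 1, 2], [2, 3, 4]] and separators = [[2]]; each term of the matrix-vector product c*v - K@v then involves only clique-local blocks, which is what permits the distributed computation.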

    Scalable Hash-Based Estimation of Divergence Measures

    We propose a scalable divergence estimation method based on hashing. Consider two continuous random variables $X$ and $Y$ whose densities have bounded support. We employ a particular locality-sensitive random hashing, and consider the ratio of samples in each hash bin having non-zero numbers of $Y$ samples. We prove that the weighted average of these ratios over all of the hash bins converges to the f-divergence between the two sample sets. We show that the proposed estimator is optimal in terms of both MSE rate and computational complexity. We derive the MSE rates for two families of smooth functions: the Hölder smoothness class and differentiable functions. In particular, it is proved that if the density functions have bounded derivatives up to order $d/2$, where $d$ is the dimension of the samples, the optimal parametric MSE rate of $O(1/N)$ can be achieved. The computational complexity is shown to be $O(N)$, which is optimal. To the best of our knowledge, this is the first empirical divergence estimator that has optimal computational complexity and achieves the optimal parametric MSE estimation rate. Comment: 11 pages, Proceedings of the 21st International Conference on Artificial Intelligence and Statistics (AISTATS) 2018, Lanzarote, Spain.
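
The following sketch conveys the hashing idea with a simple fixed-width floor-grid hash in place of the paper's locality-sensitive scheme, and a plain plug-in estimate in place of its bias-corrected, optimally tuned estimator; `hash_divergence` and the bin width `eps` are our assumptions.

```python
import numpy as np
from collections import Counter

def hash_divergence(X, Y, f=lambda t: t * np.log(t), eps=0.2):
    """Plug-in estimate of the f-divergence D_f(p || q) = E_q[f(p/q)] from
    samples X ~ p and Y ~ q, using a floor-grid hash with bin width eps."""
    hx = Counter(map(tuple, np.floor(X / eps).astype(int)))
    hy = Counter(map(tuple, np.floor(Y / eps).astype(int)))
    N, M = len(X), len(Y)
    total = 0.0
    for b, m in hy.items():  # only bins with Y samples contribute
        ratio = (hx.get(b, 0) / N) / (m / M)  # estimated density ratio p/q in bin b
        if ratio > 0:  # with f(t) = t log t, empty-X bins contribute 0
            total += (m / M) * f(ratio)
    return total
```

With the default f(t) = t log t this estimates KL(p || q); hashing each sample once and iterating only over occupied bins keeps the cost linear in the number of samples, which is the source of the $O(N)$ complexity.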

    Node Removal Vulnerability of the Largest Component of a Network

    The connectivity structure of a network can be very sensitive to the removal of certain nodes. In this paper, we study the sensitivity of the largest component size to node removals. We prove that minimizing the largest component size is equivalent to solving a matrix one-norm minimization problem whose column vectors are orthogonal and sparse and form a basis of the null space of the associated graph Laplacian matrix. A greedy node removal algorithm is then proposed based on the matrix one-norm minimization. In comparison with other node centralities such as node degree and betweenness, experimental results on the US power grid dataset validate the effectiveness of the proposed approach in terms of reduction of the largest component size with relatively few node removals. Comment: Published in IEEE GlobalSIP 201
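
A minimal sketch of the greedy flavor of this approach, using brute-force component-size evaluation in place of the paper's matrix one-norm minimization over the Laplacian null space; `greedy_node_removal` and the use of networkx are our choices.

```python
import networkx as nx

def largest_component_size(G):
    """Size of the largest connected component (0 for an empty graph)."""
    return max((len(c) for c in nx.connected_components(G)), default=0)

def greedy_node_removal(G, k):
    """Remove k nodes one at a time, each chosen to minimize the size of the
    largest component that remains after its removal."""
    G = G.copy()
    removed = []
    for _ in range(k):
        best = min(G.nodes,
                   key=lambda v: largest_component_size(nx.restricted_view(G, [v], [])))
        G.remove_node(best)
        removed.append(best)
    return removed, largest_component_size(G)
```

Comparing the component-size reduction achieved by this greedy criterion against removals ranked by degree or betweenness centrality reproduces the kind of experiment the paper reports on the power grid data.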